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Please cancel claims 1-5, amend claim 6 and add claims 7 and 8 as follows: 

6. (Amended) An improved speech recognition system comprising: 

a speech recognizer; and 

a source normalization model derived by application of an expectation maximization algorithm with 
explicit separation of source information and environment distortion factors in an unsupervised manner. 

7. A recognition system comprising: 
an input signal; 

a stored reference; 

and a comparator for comparing said input signal with said stored reference; 

said stored reference has at least two components, one representing signal source and the other 
representing transformations for a number of environments; 

said reference is identified by source normalization training, which consists of performing the 
following steps: 

(a) determining a new set of signal source representations, or at least part of the representation, that 
reduces the distance between the new reference and a training signal, given training signals and 
current transformations; and 

(b) for each environment, determining a new transformation, or at least part of it, that, jointly with the 
signal source representation, reduces the distance between the new reference and the training 
signal, where said environment represents either a label associated with a training signal or a class of 
distortion sources. 



8. A method of speech recognition comprising the steps of: 

providing a stored reference having two components with one representing signal sources 
and the other representing transformations for a number of environments; said reference being 
identified by source normalization training which consists of performing the steps of determining a 
new set of signal source representations that reduces the distance between the new reference and the 
training signal, given training signals and current transformations and for each environment, 
determining a new transformation that jointly with the signal source representation reduces the distance 
between the new reference and the training signal where said environment represents either a label 
associated with a training signal or a class of distortion sources; 

receiving an input signal; and 

comparing said input signal with said stored reference. 
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SOURCE NORMALIZATION TRAINING FOR HMM MODELING OF 

SPEECH 

TECHNICAL FIELD OF THE INVENTION 

This invention relates to training for HMM modeling of speech and more 
particularly to removing environmental factors from the speech signal during the training 
procedure. 

BACKGROUND OF THE INVENTION 

In the present application, the term environment refers to the speaker, handset or 
microphone, transmission channel, noise background conditions, or a combination of these. 
A speech signal can only be measured in a particular environment. 
Speech recognizers suffer from environment variability for two reasons: trained model distributions 
may be biased from testing signal distributions because of environment mismatch, and trained 
model distributions are flat because they are averaged over different environments. 

The first problem, the environmental mismatch, can be reduced through model 
adaptation, based on some utterances collected in the testing environment. To solve the 
second problem, the environmental factors should be removed from the speech signal 
during the training procedure, mainly by source normalization. 
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In the direction of source normalization, speaker adaptive training uses linear 
regression (LR) solutions to decrease inter-speaker variability. See, for example, T. 
Anastasakos, et al., "A compact model for speaker-adaptive training," International 
Conference on Spoken Language Processing, Vol. 2, October 1996. Another technique 
models mean-vectors as the sum of a speaker-independent bias and a speaker-dependent 
vector. This is found in A. Acero, et al., "Speaker and Gender Normalization for 
Continuous-Density Hidden Markov Models," in Proc. of IEEE International Conference 
on Acoustics, Speech and Signal Processing, pages 342-345, Atlanta, 1996. Both of these 
techniques require explicit labels for the classes, for example the speaker or gender of each 
utterance, during training. Therefore, they cannot be used to train clusters of classes 
that represent acoustically close speakers, handsets or microphones, or background noises. 
This inability to discover clusters may be a disadvantage in applications. 
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SUMMARY OF THE INVENTION 



In accordance with one embodiment of the present invention, we provide a 
maximum likelihood (ML) linear regression (LR) solution to the environment 
normalization problem, where the environment is modeled as a hidden (non-observable) 
variable. An EM-based training algorithm can generate optimal clusters of environments, 
and therefore it is not necessary to label a database in terms of environment. For a special 
case, the technique is compared with the utterance-by-utterance cepstral mean normalization 
(CMN) technique and shows a performance improvement on a noisy telephone speech 
database. 

In accordance with one embodiment of the present invention, under the maximum-likelihood 
(ML) criterion, by application of the EM algorithm and extension of the Baum-Welch 
forward and backward variables and algorithm, we obtain a joint solution for the parameters 
of the source normalization, i.e. the canonical distributions, the transformations and the 
biases. 

These and other features of the invention will be apparent to those skilled in the 
art from the following detailed description of the invention, taken together with the 
accompanying drawings. 
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DESCRIPTION OF THE DRAWINGS 



Fig. 1 is a block diagram of the system according to one embodiment of the present 
invention; 

Fig. 2 illustrates a speech model; 

Fig. 3 illustrates a Gaussian distribution; 

Fig. 4 illustrates distortions in the distribution caused by different environments; 

Fig. 5 is a more detailed flow diagram of the process according to one embodiment 
of the present invention; and 

Fig. 6 is a recognizer according to an embodiment of the present invention using a 
source normalization model. 
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DESCRIPTION OF PREFERRED EMBODIMENTS OF THE PRESENT 
INVENTION 

The training is done on a computer workstation, illustrated in Fig. 1, 
having a monitor 11, a computer workstation 13, a keyboard 15, and a mouse or other 
interactive device 15a. The system may be connected to a separate 
database, represented by database 17 in Fig. 1, for storage and retrieval of models. 

By the term "training" we mean herein fixing the parameters of the speech models 
according to an optimum criterion. In this particular case, we use Hidden Markov 
Models (HMMs). These models are as represented in Fig. 2, with states A, B, and C and 
transitions E, F, G, H, I and J between states. Each of these states has a mixture of 
Gaussian distributions 18, represented by Fig. 3. We are training these models to account 
for different environments. By environment we mean different speaker, handset, 
transmission channel, and noise background conditions. Speech recognizers suffer from 
environment variability because trained model distributions may be biased from testing 
signal distributions because of environment mismatch, and because trained model 
distributions are flat, being averaged over different environments. For the first problem, the 
environmental mismatch can be reduced through model adaptation, based on utterances 
collected in the testing environment. Applicant's teaching herein is to solve the second 
problem by removing the environmental factors from the speech signal during the 
training procedure. This is source normalization training according to the present 
invention. A maximum likelihood (ML) linear regression (LR) solution to the 
environmental problem is provided herein, where the environment is modeled as a hidden 
(non-observable) variable. 

A clean speech pattern distribution 40 will undergo complex distortion with 
different environments, as shown in Fig. 4. The two axes represent two parameters, which 
may be, for example, frequency, energy, formant, spectral, or cepstral components. 
Fig. 4 illustrates a change at 41 in the distribution due to background noise or a change in 
speakers. The purpose of the application is to model the distortion. 

The present model assumes the following: 1) the speech signal $x$ is generated by a 
Continuous Density Hidden Markov Model (CDHMM), whose distributions are called source distributions; 2) 
before being observed, the signal has undergone an environmental transformation, drawn 
from a set of transformations, where $W_{je}$ is the transformation on HMM state $j$ of 
environment $e$; 3) such a transformation is linear, and is independent of the mixture 
components of the source; and 4) there is a bias vector $b_{ke}$ at the $k$-th mixture component 
due to environment $e$. 

What we observe at time $t$ is: 

$$o_t = W_{je}\, x_t + b_{ke} \tag{1}$$

Our problem now is to find, in the maximum likelihood (ML) sense, the optimal 
source distributions, the transformation and the bias set. 
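
As a minimal sketch of the observation model of equation 1, assuming plain Python lists for vectors and matrices: the function name `apply_environment` and the numeric values below are illustrative assumptions and do not come from the source.

```python
def apply_environment(W, x, b):
    """Return o = W x + b for a matrix W (list of rows) and vectors x, b."""
    if len(W) != len(b) or any(len(row) != len(x) for row in W):
        raise ValueError("dimension mismatch between W, x and b")
    return [sum(w_mn * x_n for w_mn, x_n in zip(row, x)) + b_m
            for row, b_m in zip(W, b)]


# Hypothetical 2-dimensional example: one state j, one mixture k, one environment e.
W_je = [[1.0, 0.5],
        [0.0, 2.0]]
b_ke = [0.1, -0.2]
x_t = [1.0, 2.0]           # clean source vector
o_t = apply_environment(W_je, x_t, b_ke)   # observed (distorted) vector
```

Training works in the opposite direction: only `o_t` is observed, and the source distributions, the transformations and the biases are all estimated from it.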
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In the prior art (A. Acero, et al. cited above and T. Anastasakos, et al. cited 
above), the environment e must be explicit, e.g.: speaker identity, male/female. This 
work overcomes this limitation by allowing an arbitrary number of environments which 
are optimally trained. 

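Let $N$ be the number of HMM states, $M$ be the mixture number, $L$ be the number 
of environments, $\Omega_s \triangleq \{1, 2, \ldots, N\}$ be the set of states, $\Omega_m \triangleq \{1, 2, \ldots, M\}$ be the set of 
mixture indicators, and $\Omega_e \triangleq \{1, 2, \ldots, L\}$ be the set of environmental indicators. 

For an observed speech sequence of $T$ vectors $O \triangleq o_1^T \triangleq (o_1, o_2, \ldots, o_T)$, we 
introduce the state sequence $\Theta \triangleq (\theta_0, \ldots, \theta_T)$ where $\theta_t \in \Omega_s$, the mixture indicator sequence 
$\Xi \triangleq (\xi_1, \ldots, \xi_T)$ where $\xi_t \in \Omega_m$, and the environment indicator sequence $\Phi \triangleq (\varphi_1, \ldots, \varphi_T)$ where 
$\varphi_t \in \Omega_e$. They are all unobservable. Under some additional assumptions, the joint 
probability of $O$, $\Theta$, $\Xi$, and $\Phi$ given model $\lambda$ can be written as: 

$$p(O, \Theta, \Xi, \Phi \mid \lambda) = u_{\theta_1} \Big[ \prod_{t=1}^{T} c_{\theta_t \xi_t}\, b_{\theta_t \xi_t \varphi}(o_t)\, a_{\theta_t \theta_{t+1}} \Big]\, l_{\varphi} \tag{2}$$

where 

$$b_{jke}(o_t) \triangleq p(o_t \mid \theta_t = j, \xi_t = k, \varphi = e, \lambda) \tag{3}$$
$$\phantom{b_{jke}(o_t)} = N(o_t;\, W_{je}\,\mu_{jk} + b_{ke},\, \Sigma_{jk}) \tag{4}$$
$$u_i \triangleq p(\theta_1 = i), \qquad a_{ij} \triangleq p(\theta_{t+1} = j \mid \theta_t = i) \tag{5}$$
$$c_{jk} \triangleq p(\xi_t = k \mid \theta_t = j, \lambda), \qquad l_e \triangleq p(\varphi = e \mid \lambda) \tag{6}$$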
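
The Gaussian observation density of equation 4 can be sketched as follows. This is an illustration under an extra assumption, namely a diagonal covariance matrix (which the description adopts later for the transformation update); the function name `log_b_jke` and its argument layout are hypothetical.

```python
import math


def log_b_jke(o, W, mu, b, var):
    """Log of N(o; W mu + b, diag(var)) for one state j, mixture k, environment e.

    o, mu, b, var are lists of length D; W is a D x D list of rows.
    """
    # Environment-specific mean: source mean transformed by W and shifted by b.
    mean = [sum(w_mn * mu_n for w_mn, mu_n in zip(row, mu)) + b_m
            for row, b_m in zip(W, b)]
    # Sum of per-dimension Gaussian log densities (diagonal covariance).
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(o, mean, var))


# Hypothetical check: an observation exactly at the transformed mean, unit variances.
identity = [[1.0, 0.0], [0.0, 1.0]]
value_at_mean = log_b_jke([1.5, -0.5], identity, [1.0, 0.0], [0.5, -0.5], [1.0, 1.0])
```

Working in the log domain avoids underflow when such densities are multiplied over many frames.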

Referring to Fig. 1, the workstation 13, which includes a processor, contains a program 
as illustrated that starts with an initial standard HMM model 21, which is refined by 
estimation procedures using Baum-Welch or Expectation-Maximization procedures 23 to 
get new models 25. The program gets training data from database 19, collected under different 
environments, and uses it in an iterative process to get optimal parameters. From this 
model we get another model 25 that takes into account environment changes. The 
quantities are defined by probabilities of observing a particular input vector at some 
particular state for a particular environment given the model. 

The model parameters can be determined by applying a generalized EM procedure 
with three types of hidden variables: state sequence, mixture component indicators, and 
environment indicators. (A. P. Dempster, N. M. Laird, and D. B. Rubin, 
"Maximum Likelihood from Incomplete Data via the EM Algorithm," Journal of the 
Royal Statistical Society, 39 (1): 1-38, 1977.) For this purpose, Applicant teaches that the 
CDHMM formulation from B. Juang, "Maximum-Likelihood Estimation for Mixture 
Multivariate Stochastic Observation of Markov Chains" (The Bell System Technical 
Journal, pages 1235-1248, July-August 1985) be extended as set out in the following 
paragraphs. 

TI-25489 (Page 8) 

Denote: 

$$\alpha_t(j, e) \triangleq p(o_1^t, \theta_t = j, \varphi = e \mid \lambda) \tag{7}$$
$$\beta_t(j, e) \triangleq p(o_{t+1}^T \mid \theta_t = j, \varphi = e, \lambda) \tag{8}$$
$$\gamma_t(j, k, e) \triangleq p(\theta_t = j, \xi_t = k, \varphi = e \mid O, \lambda) \tag{9}$$

The speech is observed as a sequence of frames, each of which is a vector. Equations 7, 8, and 9 
define the intermediate quantities to be estimated. For example, equation 7 is the joint 
probability of observing the frames from time 1 to $t$ and of being in state $j$ at time $t$ in 
environment $e$, given the model $\lambda$. 
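
The forward quantity of equation 7 can be computed with the usual forward recursion, run once per environment. The sketch below assumes that the environment is constant over an utterance and that the mixture-summed emission likelihoods have already been tabulated; the function names and the table layout `B[e][t][j]` are illustrative assumptions, not the source's notation.

```python
def forward(u, a, l, B):
    """Return alpha[t][j][e] = p(o_1..o_t, state j at t, environment e | model).

    u[j]      initial state probabilities
    a[i][j]   transition probabilities
    l[e]      environment probabilities
    B[e][t][j] emission likelihood of frame t in state j under environment e,
               already summed over mixture components
    """
    N, L, T = len(u), len(l), len(B[0])
    alpha = [[[0.0] * L for _ in range(N)] for _ in range(T)]
    for e in range(L):
        for j in range(N):
            alpha[0][j][e] = l[e] * u[j] * B[e][0][j]
        for t in range(1, T):
            for j in range(N):
                reach_j = sum(alpha[t - 1][i][e] * a[i][j] for i in range(N))
                alpha[t][j][e] = reach_j * B[e][t][j]
    return alpha


def utterance_likelihood(alpha):
    """p(O | model): sum of the final forward values over states and environments."""
    return sum(sum(per_env) for per_env in alpha[-1])


# Hypothetical sanity check: with all emission likelihoods equal to 1 the total must be 1.
u = [0.6, 0.4]
a = [[0.7, 0.3], [0.2, 0.8]]
l = [0.5, 0.5]
ones = [[[1.0, 1.0] for _ in range(3)] for _ in range(2)]
total = utterance_likelihood(forward(u, a, l, ones))
```

For long utterances a practical implementation would scale the forward values or work in the log domain; that is omitted here for brevity.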

The following re-estimation equations can be derived from equations 2, 7, 8, and 9. 

For the EM procedure 23, equations 10-26 are solutions for the quantities in the model. 

Initial state probability: 

with R the number of training tokens. 
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Transition probability: 



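[Equation 11 is garbled in the source text. The legible fragments show $a_{ij}$ as a ratio in which both the numerator and the denominator are summed over the training tokens $r = 1, \ldots, R$, the environments $e \in \Omega_e$ and the frames $t$.] (11) 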



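Mixture component probability (a mixture probability arises where a state has a mixture of 
Gaussian distributions): 

[Equation 12 is garbled in the source text. The legible fragments show $c_{jk}$ as a ratio whose denominator is summed over the training tokens, the environments and the frames.] (12) 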

Environment probability: 

[Equation 13, which gives the environment probability $l_e$, is garbled in the source text.] (13) 
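
For orientation, a generic EM derivation for a model with these three kinds of hidden variables yields occupancy-based updates of the following shape, written with the posterior of equation 9 evaluated on training token $r$. This is a sketch under stated assumptions (one environment per token, $T$ frames per token); it is not a transcription of equations 10 to 13, whose exact form may differ, for example by being written with the forward and backward variables.

```latex
\begin{align*}
u_i &= \frac{1}{R}\sum_{r=1}^{R}\sum_{k\in\Omega_m}\sum_{e\in\Omega_e}\gamma_1^r(i,k,e) \\
c_{jk} &= \frac{\sum_{r=1}^{R}\sum_{t=1}^{T}\sum_{e\in\Omega_e}\gamma_t^r(j,k,e)}
               {\sum_{r=1}^{R}\sum_{t=1}^{T}\sum_{k'\in\Omega_m}\sum_{e\in\Omega_e}\gamma_t^r(j,k',e)} \\
l_e &= \frac{1}{R}\sum_{r=1}^{R}\sum_{j\in\Omega_s}\sum_{k\in\Omega_m}\gamma_1^r(j,k,e)
\end{align*}
```

Each update is a normalized expected count: how often a state starts a token, how often a mixture component is used within a state, and how often an environment is judged responsible for a token.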



Mean vector and bias vector: We introduce: 

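$$\rho(j, k, e) \triangleq \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e)\, o_t^r \tag{14}$$
$$g(j, k, e) \triangleq \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e) \tag{15}$$
$$G_{ke} = \sum_{j \in \Omega_s} g(j, k, e)\, \Sigma_{jk}^{-1} \tag{16}$$
$$E_{jke} = g(j, k, e)\, W_{je}'\, \Sigma_{jk}^{-1} \tag{17}$$
$$F_{jk} = \sum_{e \in \Omega_e} E_{jke}\, W_{je} \tag{18}$$
$$c_{ke} = \sum_{j \in \Omega_s} \Sigma_{jk}^{-1}\, \rho(j, k, e) \tag{19}$$
$$a_{jk} = \sum_{e \in \Omega_e} W_{je}'\, \Sigma_{jk}^{-1}\, \rho(j, k, e) \tag{20}$$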



Assuming W je = PF, e and Z~/ k = Z~/ k , for a given k, we have N+L equations: 



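$$\sum_{e \in \Omega_e} E_{jke}\, b_{ke} + F_{jk}\, \mu_{jk} = a_{jk}, \qquad \forall j \in \Omega_s \tag{21}$$
$$G_{ke}\, b_{ke} + \sum_{j \in \Omega_s} E_{jke}'\, \mu_{jk} = c_{ke}, \qquad \forall e \in \Omega_e \tag{22}$$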



Equations 21 and 22 are solved jointly for the mean vectors and bias vectors. 

Therefore $\mu_{jk}$ and $b_{ke}$ can be simultaneously obtained by solving the linear system 
of $N+L$ variables. 
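
The joint solve can be illustrated in the scalar case. The sketch below assumes one-dimensional features and a single mixture component, so that each matrix in the linear system reduces to a number; the function names, the Gaussian-elimination solver and the test data are illustrative assumptions and not the source's implementation.

```python
def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def solve_means_and_biases(g, rho, w, inv_var):
    """Scalar (D = 1) joint solve for source means mu[j] and biases b[e].

    g[j][e]    occupancy counts, rho[j][e] occupancy-weighted observation sums
    w[j][e]    scalar transformation, inv_var[j] inverse variance of state j
    """
    N, L = len(g), len(g[0])
    E = [[g[j][e] * w[j][e] * inv_var[j] for e in range(L)] for j in range(N)]
    A = [[0.0] * (N + L) for _ in range(N + L)]
    y = [0.0] * (N + L)
    for j in range(N):                               # one row per state
        A[j][j] = sum(E[j][e] * w[j][e] for e in range(L))
        for e in range(L):
            A[j][N + e] = E[j][e]
        y[j] = sum(w[j][e] * inv_var[j] * rho[j][e] for e in range(L))
    for e in range(L):                               # one row per environment
        A[N + e][N + e] = sum(g[j][e] * inv_var[j] for j in range(N))
        for j in range(N):
            A[N + e][j] = E[j][e]
        y[N + e] = sum(inv_var[j] * rho[j][e] for j in range(N))
    x = solve_linear(A, y)
    return x[:N], x[N:]


# Hypothetical noise-free data generated from known means and biases.
mu_true, b_true = [1.0, -2.0], [0.5, 3.0]
w = [[1.0, 2.0], [3.0, 1.0]]
g = [[4.0, 2.0], [1.0, 5.0]]
inv_var = [1.0, 0.5]
rho = [[g[j][e] * (w[j][e] * mu_true[j] + b_true[e]) for e in range(2)]
       for j in range(2)]
mu, b = solve_means_and_biases(g, rho, w, inv_var)
```

In this example the transformation differs from state to state. If it were the same for every state, a common shift of the means could be absorbed by the biases, and the system would become singular or numerically close to it, so some additional constraint would be needed.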

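Covariance: 

$$\Sigma_{jk} = \frac{\sum_{e \in \Omega_e} \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e)\, \delta_t^r(j, k, e)\, \delta_t^r(j, k, e)'}{\sum_{e \in \Omega_e} \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e)} \tag{23}$$

where $\delta_t^r(j, k, e) \triangleq o_t^r - W_{je}\, \mu_{jk} - b_{ke}$. 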

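Transformation: We assume the covariance matrix to be diagonal: $\Sigma_{jk}^{-1\,(m,n)} = 0$ if $n \neq m$. 
For the line $m$ of transformation $W_{je}$, we can derive (see for example C. J. Leggetter, et 
al., "Maximum Likelihood Linear Regression for Speaker Adaptation of 
Continuous Density HMMs," Computer, Speech and Language, 9(2): 171-185, 1995): 

$$Z_{je}^{(m)} = W_{je}^{(m)}\, R_{je}(m) \tag{24}$$

which is a linear system of $D$ equations, where: 

$$Z_{je}^{(m)} \triangleq \sum_{k \in \Omega_m} \Sigma_{jk}^{-1\,(m,m)} \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e) \left(o_t^{r\,(m)} - b_{ke}^{(m)}\right) \mu_{jk}' \tag{25}$$
$$R_{je}(m) \triangleq \sum_{k \in \Omega_m} \Sigma_{jk}^{-1\,(m,m)}\, \mu_{jk}\, \mu_{jk}' \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e) \tag{26}$$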
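
When the matrix on the right of equation 24 is invertible, each row of the transformation can be written in closed form. The following is a sketch of that step, with the one-dimensional case ($D = 1$, variance $\sigma_{jk}^2$) spelled out as an illustration that is not in the source:

```latex
\begin{align*}
W_{je}^{(m)} &= Z_{je}^{(m)}\, R_{je}(m)^{-1} \\
\text{for } D = 1:\quad
W_{je} &= \frac{\sum_{k}\sigma_{jk}^{-2}\,\mu_{jk}\sum_{r,t}\gamma_t^r(j,k,e)\,\bigl(o_t^r - b_{ke}\bigr)}
               {\sum_{k}\sigma_{jk}^{-2}\,\mu_{jk}^{2}\sum_{r,t}\gamma_t^r(j,k,e)}
\end{align*}
```

In the scalar case this is a weighted least-squares slope: the bias-corrected observations are regressed on the source means, with weights given by the occupancies and inverse variances.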

If the means of the source distributions ($\mu_{jk}$) are held constant, the above set of 
source normalization formulas can also be used for model adaptation. 

The model is specified by the parameters. The new model is specified by the new 
parameters. 
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As illustrated in Figs. 1 and 5, we start with an initial as standard model 21 such 
as the CDHMM model with initial values. The next step is the Expectation Maximization 
procedure 23, starting with the estimation of intermediate quantities by equations 7-9 (Step 23a) 
and the re-estimation by equations 10-13 of the initial state probability, transition probability, 
mixture component probability and environment probability (Step 23b). 

The next step (Step 23c) is to derive the mean vector and bias vector by introducing the two 
additional equations 14 and 15 and equations 16-20. The next step (Step 23d) is to apply the linear 
equations 21 and 22 and solve them jointly for the mean vectors and bias vectors, and at 
the same time to calculate the variance using equation 23. Equation 24, which is a 
system of linear equations, is then solved for the transformation parameters using the quantities given 
by equations 25 and 26. At this point all the model parameters have been solved for. The 
old model parameters are then replaced by the newly calculated ones (Step 24). The 
process is repeated for all the frames. When this is done for all the frames of the database, 
a new model is formed, and the new models are then re-evaluated using the same equations 
until there is no change beyond a predetermined threshold (Step 27). 
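
The outer loop of this procedure can be sketched as follows. Here `reestimate` is a hypothetical stand-in for one full pass of the estimation and re-estimation steps over the training database, the model is assumed to be a flat dictionary of scalar parameters, and the toy update used in the example is not an HMM update; it only exercises the stopping rule.

```python
def train_until_converged(model, reestimate, threshold=1e-6, max_iterations=100):
    """Repeat re-estimation until no parameter changes by more than `threshold`.

    model       dict mapping parameter names to floats
    reestimate  function taking a model and returning a new model
    Returns the final model and the number of iterations performed.
    """
    for iteration in range(1, max_iterations + 1):
        new_model = reestimate(model)
        # Largest absolute parameter change between the old and the new model.
        change = max(abs(new_model[name] - model[name]) for name in model)
        model = new_model            # replace old parameters by the new ones
        if change < threshold:
            return model, iteration
    return model, max_iterations


def toy_reestimate(model):
    """Toy contraction with fixed point mu = 2; a placeholder for the real update."""
    return {"mu": 0.5 * model["mu"] + 1.0}


final_model, n_iter = train_until_converged({"mu": 0.0}, toy_reestimate)
```

A likelihood-based stopping rule (stop when the training likelihood no longer increases noticeably) is a common alternative to the parameter-change test shown here.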

After a source normalization training model is formed, this model is used in a 
recognizer as shown in Fig. 6, where input speech is applied to a recognizer 60 which 
uses the source normalized HMM model 61 created by the above training to achieve the 
response. 
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The recognition task has 53 commands of 1-4 words, ("call return", "cancel call 
return", "selective call forwarding", etc.). Utterances are recorded through telephone 
lines, with a diversity of microphones, including carbon, electret and cordless 
microphones and hands-free speaker-phones. Some of the training utterances do not 
correspond to their transcriptions. For example: "call screen" (cancel call screen), "matic 
call back" (automatic call back), "call tra" (call tracking). 

The speech is sampled at 8 kHz with a 20 ms frame rate. The observation vectors are 
composed of 13 MFCC (Mel-Scale Cepstral Coefficients) derived from LPCC (Linear 
Prediction Coding Coefficients), plus regression-based delta MFCC. CMN is performed at 
the utterance level. There are 3505 utterances for training and 720 for speaker- 
independent testing. The number of utterances per call ranges from 5 to 30. 
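
Utterance-level CMN amounts to subtracting, from every frame, the mean cepstral vector of the utterance. The following is a minimal sketch assuming an utterance is given as a list of equal-length frames; the function name is illustrative.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-dimension mean over the utterance from every frame."""
    if not frames:
        return []
    n, dim = len(frames), len(frames[0])
    mean = [sum(frame[d] for frame in frames) / n for d in range(dim)]
    return [[frame[d] - mean[d] for d in range(dim)] for frame in frames]


# Hypothetical two-frame, two-coefficient utterance.
normalized = cepstral_mean_normalize([[1.0, 2.0], [3.0, 6.0]])
```

A constant offset added to every frame, such as a fixed channel effect in the cepstral domain, is removed by this subtraction.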

Because of data sparseness, besides transformation sharing among states and 
mixtures, the transformations need to be shared by a group of phonetically similar 
phones. The grouping, based on a hierarchical clustering of phones, depends on the 
amount of training (SN) or adaptation (AD) data, i.e., the larger the number of tokens, 
the larger the number of transformations. Recognition experiments are run on several 
system configurations: 
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BASELINE applies CMN utterance-by-utterance. This simple technique will remove 
channel and some long-term speaker specificities, if the duration of the utterance is 
long enough, but cannot deal with time domain additive noises. 

SN performs source-normalized HMM training, where the utterances of a phone call 
are assumed to have been generated by a call-dependent acoustic source. Speaker, 
channel and background noise that are specific to the call are then removed by MLLR. 
An HMM recognizer is then applied using source parameters. We evaluated a special 
case, where each call is modeled by one environment. 

AD adapts traditional HMM parameters by unsupervised MLLR: 1. using the current 
HMMs and task grammar to phonetically recognize the test utterances; 2. mapping 
the phone labels to a small number (N) of classes, which depends on the amount of 
data in the test utterances; 3. estimating the LR using the N classes and associated 
test data; 4. recognizing the test utterances with the transformed HMM. A similar 
procedure has been introduced in C. J. Leggetter and P. C. Woodland, "Maximum 
likelihood linear regression for speaker adaptation of continuous density HMMs," 
Computer, Speech and Language, 9(2): 171-185, 1995. 

SN+AD refers to AD with initial models trained by SN technique. 
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Based on the results summarized in Table 1, we point out: 

• For numbers of mixture components per state smaller than 16, SN, AD, and SN+AD 
all give consistent improvement over the baseline configuration. 

• For numbers of mixture components per state smaller than 16, SN gives about 10% 
error reduction over the baseline. As SN is a training procedure which does not 
require any change to the recognizer, this error reduction mechanism immediately 
benefits applications. 

• For all tested configurations, AD using acoustic models trained with SN procedure 
always gives additional error reduction. 

• The most efficient case of SN+AD is with 32 components per state, which reduces 
the error rate by 23%, resulting in a 4.64% WER on the task. 





Configuration      4       8       16      32 
baseline         7.85    6.94    6.83    5.98 
SN               7.53    6.35    6.51    6.03 
AD               7.15    6.41    5.61    5.87 
SN+AD            6.99    6.03    5.41    4.64 

Table 1: Word error rate (%) as a function of test configuration and number of mixture 
components per state. 
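
Relative error reductions can be recomputed directly from Table 1. The sketch below copies the table values into a dictionary; the names `WER` and `relative_reduction` are illustrative.

```python
# Word error rates (%) from Table 1, indexed by configuration and mixture count.
WER = {
    "baseline": {4: 7.85, 8: 6.94, 16: 6.83, 32: 5.98},
    "SN":       {4: 7.53, 8: 6.35, 16: 6.51, 32: 6.03},
    "AD":       {4: 7.15, 8: 6.41, 16: 5.61, 32: 5.87},
    "SN+AD":    {4: 6.99, 8: 6.03, 16: 5.41, 32: 4.64},
}


def relative_reduction(config, mixtures, reference="baseline"):
    """Fraction by which `config` lowers the WER of `reference` (negative if worse)."""
    ref = WER[reference][mixtures]
    return (ref - WER[config][mixtures]) / ref


for mixtures in (4, 8, 16, 32):
    row = ", ".join(f"{name}: {100 * relative_reduction(name, mixtures):.1f}%"
                    for name in ("SN", "AD", "SN+AD"))
    print(f"{mixtures} components per state -> {row}")
```

A negative value means the configuration has a higher error rate than the reference, as happens for SN alone at 32 components per state (6.03 against 5.98).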

Although the present invention and its advantages have been described in detail, it 
should be understood that various changes, substitutions and alterations can be made herein 
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without departing from the spirit and scope of the invention as defined by the appended 
claims. 
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WHAT IS CLAIMED IS: 



1. A method of source normalization training for HMM modeling of speech 
comprising the steps of: 

(a) providing an initial model; 

(b) on said initial model or following new models performing the following 
steps to get a new model: 

b1) estimation of intermediate quantities; 
b2) performing re-estimation to determine initial state probability, 
transition probability, mixture component probability and environment 
probability; 

b3) deriving mean vector and bias vector; 
b4) solving jointly for mean vector and bias vector using linear equations 
and determining variances and transformation; and 
b5) replacing old model parameters with the calculated ones; and 
(c) determining after a new model is formed if it differs significantly from the 
previous model and if so repeating steps b1 - b5. 

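2. The method of Claim 1 wherein in step b1 the intermediate 
quantities are determined by $\alpha_t(j, e) \triangleq p(o_1^t, \theta_t = j, \varphi = e \mid \lambda)$, 
$\beta_t(j, e) \triangleq p(o_{t+1}^T \mid \theta_t = j, \varphi = e, \lambda)$, and $\gamma_t(j, k, e) \triangleq p(\theta_t = j, \xi_t = k, \varphi = e \mid O, \lambda)$. 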






3. The method of Claim 2 wherein in step b2 the initial state probability, the transition 
probability, the mixture component probability and the environment probability are each 
determined by a re-estimation formula. [The four formulas recited in this claim are garbled 
in the source text; they correspond to equations 10 to 13 of the description.] 



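4. The method of Claim 2 wherein in step b3 deriving the mean vector and bias 
vector is determined by $\rho(j, k, e) \triangleq \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e)\, o_t^r$, $g(j, k, e) \triangleq \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e)$, 
$G_{ke} = \sum_{j \in \Omega_s} g(j, k, e)\, \Sigma_{jk}^{-1}$, $E_{jke} = g(j, k, e)\, W_{je}'\, \Sigma_{jk}^{-1}$, $F_{jk} = \sum_{e \in \Omega_e} E_{jke}\, W_{je}$. 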



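5. The method of Claim 2 wherein in step b4 the equations 
$\sum_{e \in \Omega_e} E_{jke}\, b_{ke} + F_{jk}\, \mu_{jk} = a_{jk},\ \forall j \in \Omega_s$ and $G_{ke}\, b_{ke} + \sum_{j \in \Omega_s} E_{jke}'\, \mu_{jk} = c_{ke},\ \forall e \in \Omega_e$ are used 
for solving jointly, the equation 
$\Sigma_{jk} = \frac{\sum_{e \in \Omega_e} \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e)\, \delta_t^r(j, k, e)\, \delta_t^r(j, k, e)'}{\sum_{e \in \Omega_e} \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e)}$ 
is used to determine the variance, and the equations $Z_{je}^{(m)} = W_{je}^{(m)}\, R_{je}(m)$, 
$Z_{je}^{(m)} \triangleq \sum_{k \in \Omega_m} \Sigma_{jk}^{-1\,(m,m)} \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e) \left(o_t^{r\,(m)} - b_{ke}^{(m)}\right) \mu_{jk}'$, and 
$R_{je}(m) \triangleq \sum_{k \in \Omega_m} \Sigma_{jk}^{-1\,(m,m)}\, \mu_{jk}\, \mu_{jk}' \sum_{r=1}^{R} \sum_{t=1}^{T} \gamma_t^r(j, k, e)$ are used to determine the transformation. 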



6. An improved speech recognition system comprising: 
a speech recognizer; and 

a source normalization model derived by application of an estimation 
maximization algorithm. 






ABSTRACT OF THE DISCLOSURE 



A maximum likelihood (ML) linear regression (LR) solution to environment 
normalization is provided where the environment is modeled as a hidden (non-observable) 
variable. By application of an expectation maximization algorithm and extension of Baum- 
Welch forward and backward variables (Steps 23a-23d) a source normalization is achieved 
such that it is not necessary to label a database in terms of environment such as speaker 
identity, channel, microphone and noise type. 
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date shown below and is addressed to the Assistant Commissioner for 
Patents, Washington, D.C. 20231. 



Robert L. Troike, Reg. No. 24,183 



Date 



Sir: 



Enclosed are THREE (3) sheets of formal drawings for the above-referenced case. Please 
charge any necessary fees to Deposit Account No. 20-0668 of Texas Instruments Incorporated. 
This sheet is enclosed in triplicate. 



Texas Instruments Incorporated 
P.O. Box 655474, M/S 3999 
Dallas, TX 75265 
(202) 639-7710 



Respectfully submitted, 

Robert L. Troike 
Attorney for Applicant 
Reg. No. 24,183 
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ESTIMATION OF INTERMEDIATE QUANTITIES USING EQUATIONS 7-9 
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RE-ESTIMATION EQUATIONS 
INITIAL STATE PROBABILITY: 
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DERIVE MEAN VECTOR AND BIAS VECTOR EQUATIONS 14-20 
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SOLVE JOINTLY FOR MEAN VECTORS AND BIAS VECTORS USING LINEAR EQUATIONS 21 AND 22 
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AND DETERMINE COVARIANCES USING EQUATION 23 
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AND TRANSFORMATION USING EQUATIONS 24, 24, AND 26 
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REPLACE OLD MODEL PARAMETERS FOR THE CALCULATED ONES 
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ATTORNEY'S DOCKET NO. 
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APPLICATION FOR UNITED STATES PATENT 
DECLARATION AND POWER OF ATTORNEY 

As a below named inventor, I declare that my residence, post office address and citizenship are 
as stated below next to my name; that I verily believe that I am the original, first and sole 
inventor if only one name is listed below, or an original, first and joint inventor if plural 
inventors are named below, of the subject matter which is claimed and for which a patent is 
sought on the invention entitled as set forth below, and the title as set forth below which is 
described in the attached specification; that I have reviewed and understand the contents of the 
specification, including the claims, as amended by any amendment specifically referred to in the 
oath or declaration; that no application for patent or inventor's certificate on this invention 
has been filed by me or my legal representatives or assigns in any country foreign to the United 
States of America prior to the filing date of said application; and that I acknowledge my duty to 
disclose information which is material to the patentability of this application in accordance 
with Title 37, Code of Federal Regulations, section 1.56; 

I further declare that all statements made herein of my own knowledge are true and that all 
statements made on information and belief are believed to be true, and further that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under section 1001 of Title 18 of the United States 
Code, and that such willful false statements may jeopardize the validity of the application or 
any patent issuing thereon. 



TITLE OF INVENTION: 

Source Normalization Training for HMM Modeling of Speech 


POWER OF ATTORNEY: I HEREBY APPOINT THE FOLLOWING ATTORNEYS TO PROSECUTE THIS APPLICATION AND TRANSACT 
ALL BUSINESS IN THE PATENT AND TRADEMARK OFFICE CONNECTED THEREWITH 

Robert L. Troike, #24,183; Richard L. Donaldson, #25,673; Jay M. Cantor, #19,906; Rene E. Grossman, 
656, W. James Brady, III, #32,080; William B. Kempler, Reg. No. 28,228; Warren L. Franz, #28,716 


SEND CORRESPONDENCE TO: 

Robert L. Troike 
Texas Instruments Incorporated 
P.O. Box 655474, MS 219 
Dallas, TX 75265 


DIRECT TELEPHONE CALLS TO: 

Robert L. Troike 
972/995-1364 


NAME OF INVENTOR: (1) 
Yifan Gong 


NAME OF INVENTOR: (2) 


NAME OF INVENTOR: (3) 


RESIDENCE & POST OFFICE ADDRESS: 

7750 Walnut Hill Lane #2107 
Dallas, Texas 75230 


RESIDENCE & POST OFFICE ADDRESS: 


RESIDENCE & POST OFFICE ADDRESS: 


COUNTRY OF CITIZENSHIP: 
France 


COUNTRY OF CITIZENSHIP: 


COUNTRY OF CITIZENSHIP: 


SIGNATURE OF INVENTOR: 


SIGNATURE OF INVENTOR: 


SIGNATURE OF INVENTOR: 


DATE: 
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DATE: 
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